Last updated: 2016-08-29

Code version: 9490c5b10206b54f5bf4eebe6faed3c258c87cbf

Study design

We have collected dendritic cells (DCs) from two populations: the first are individuals that have latent TB infections (putatively resistant), and the second are individuals that have recovered from an active TB infection (putatively sensitive). We performed RNA-seq on MTB-infected DCs and mock-treated controls. Because of our small sample size, we are not performing an eQTL analysis. Instead, we are performing a differential expression (DE) analysis to find gene expression differences between susceptible and resistant individuals.

To test for DE, we used the following linear model (implemented with limma+vooom):

\[ Y\ \sim\beta_0+X_{treat}\beta_{treat}+X_{status}\beta_{status}+X_{treat,status}\beta_{treat,status}+I+\epsilon \]

where \(Y\) is the expression level of a gene, \(\beta_{treat}\) is the fixed effect of treatment with MTB, \(\beta_{status}\) is the fixed effect of susceptibility status, \(\beta_{treat,status}\) is the fixed effect of interaction between treatment and susceptibility status, and \(I\) is the random effect of individual (implemented via duplicateCorrelation).

The assumption is that genes which are differentially expressed in our in vitro system will have regulatory variants that affect TB susceptibility. In other words, we expect that SNPs nearby the differentially expressed genes will be enriched for low p-values obtained from GWAS of TB susceptibility. We would like to test this by combining the DE results with published GWAS results. Specifically we use the following framework.

  1. For each gene that was tested for DE, find all GWAS SNPs within +/- 50 kb of the TSS.
  2. Assign the minimum GWAS p-value from this set of SNPs to the gene.
  3. Test for a relationship between the DE effect size (the \(B\)’s from the model above) and the GWAS p-values.

If there is a decrease in the GWAS p-values for increasing DE effect sizes, this suggests that these genes contain nearby genetic variants that affects TB susceptibility.

In this analysis, we use the p-values from a GWAS of TB susceptibility in Ghana published in Thye et al., 2010. The initial results look promising, i.e. there is a decrease in GWAS p-values for increasing DE effect sizes. Ideally we would also test for an enrichment from additional TB GWAS.

Code

library("limma")
library("data.table")
library("biomaRt")
library("SNPlocs.Hsapiens.dbSNP144.GRCh38")
library("dplyr")
library("GenomicRanges")
library("ggplot2")
library("cowplot")

Input the results of the model fit by limma.

fit <- readRDS("../data/results-limma-fit.rds")

Input summary statistics from Thye et al., 2010.

gwas_thye_ghana <- fread("../data/OUT_PLINK_Gambia.txt", data.table = FALSE,
                         verbose = FALSE)
## 
Read 5.0% of 2406966 rows
Read 39.9% of 2406966 rows
Read 76.4% of 2406966 rows
Read 2406966 rows and 6 (of 6) columns from 0.078 GB file in 00:00:06

The file contains the chromosome, SNP rsID, odds ratio (OR), and p-value (PVAL). Because the file does not contain any allele frequency data, it should not be possible to identify study participants (Craig et al., 2011).

colnames(gwas_thye_ghana)
## [1] "CHR"   "SNP"   "TEST"  "NMISS" "OR"    "PVAL"

Assign a GWAS summary statistic to each gene

11336 genes were tested for differential expression.

gene_names <- rownames(fit$coefficients)
head(gene_names)
## [1] "ENSG00000279457" "ENSG00000188976" "ENSG00000187961" "ENSG00000187583"
## [5] "ENSG00000187642" "ENSG00000188290"

Obtain the transcription start site (TSS) for each gene.

# Ensembl 83, Dec 2015, grch38.p5, hg38
# ensembl <- useMart(host = "dec2015.archive.ensembl.org",
#                    biomart = "ENSEMBL_MART_ENSEMBL",
#                    dataset = "hsapiens_gene_ensembl")
tss_all_fname <- "../data/tss-all.rds"
if (file.exists(tss_all_fname)) {
  tss_all <- readRDS(tss_all_fname)
} else {
  tss_all <- getBM(attributes = c("ensembl_gene_id", "chromosome_name",
                                  "transcription_start_site", "strand"),
                   filters = "ensembl_gene_id",
                   values = gene_names,
                   mart = ensembl)
  saveRDS(tss_all, file = tss_all_fname)
}
head(tss_all)
##   ensembl_gene_id chromosome_name transcription_start_site strand
## 1 ENSG00000000419              20                 50958550     -1
## 2 ENSG00000000419              20                 50958555     -1
## 3 ENSG00000000419              20                 50945861     -1
## 4 ENSG00000000419              20                 50958521     -1
## 5 ENSG00000000419              20                 50958532     -1
## 6 ENSG00000000457               1                169893952     -1

This returns the TSS for each transcript of gene. To reduce this to one number, take the most upstream TSS (i.e. the most 5’ for genes on positive strand and the most 3’ for genes on the negative strand).

tss <- tss_all %>%
  group_by(ensembl_gene_id) %>%
  summarize(chr = chromosome_name[1],
            strand = strand[1],
            tss = if (strand == 1) min(transcription_start_site) else max(transcription_start_site))
head(tss)
## Source: local data frame [6 x 4]
## 
##   ensembl_gene_id   chr strand       tss
##             (chr) (chr)  (int)     (int)
## 1 ENSG00000000419    20     -1  50958555
## 2 ENSG00000000457     1     -1 169894267
## 3 ENSG00000000460     1      1 169662007
## 4 ENSG00000000938     1     -1  27635277
## 5 ENSG00000000971     1      1 196651878
## 6 ENSG00000001036     6     -1 143511690

The window to search for SNPs for a gene will be +/- 50 kb from the TSS.

window <- 50000
tss$start <- tss$tss - window
tss$end <- tss$tss + window
tss$strand <- ifelse(tss$strand == 1, "+", "-")
tss_gr <- makeGRangesFromDataFrame(tss, keep.extra.columns = TRUE)
seqlevels(tss_gr) <- paste0("ch", seqlevels(tss_gr))
tss_gr
## GRanges object with 11336 ranges and 2 metadata columns:
##           seqnames                 ranges strand   | ensembl_gene_id
##              <Rle>              <IRanges>  <Rle>   |     <character>
##       [1]     ch20 [ 50908555,  51008555]      -   | ENSG00000000419
##       [2]      ch1 [169844267, 169944267]      -   | ENSG00000000457
##       [3]      ch1 [169612007, 169712007]      +   | ENSG00000000460
##       [4]      ch1 [ 27585277,  27685277]      -   | ENSG00000000938
##       [5]      ch1 [196601878, 196701878]      +   | ENSG00000000971
##       ...      ...                    ...    ... ...             ...
##   [11332]     ch16 [ 29765952,  29865952]      +   | ENSG00000280789
##   [11333]      ch7 [ 30500761,  30600761]      -   | ENSG00000281039
##   [11334]     ch10 [ 13560047,  13660047]      +   | ENSG00000282246
##   [11335]      ch1 [111453760, 111553760]      -   | ENSG00000282608
##   [11336]      ch7 [150350702, 150450702]      +   | ENSG00000283013
##                 tss
##           <integer>
##       [1]  50958555
##       [2] 169894267
##       [3] 169662007
##       [4]  27635277
##       [5] 196651878
##       ...       ...
##   [11332]  29815952
##   [11333]  30550761
##   [11334]  13610047
##   [11335] 111503760
##   [11336] 150400702
##   -------
##   seqinfo: 25 sequences from an unspecified genome; no seqlengths

Convert rsID to genomic coordinates.

snp_coords_fname <- "../data/snp-coords.rds"
if (file.exists(snp_coords_fname)) {
  snp_coords <- readRDS(snp_coords_fname)
} else {
  snp_coords <- snpsById(SNPlocs.Hsapiens.dbSNP144.GRCh38, gwas_thye_ghana$SNP,
                         ifnotfound = "drop")
  saveRDS(snp_coords, file = snp_coords_fname)
}
stopifnot(mcols(snp_coords)$RefSNP_id %in% gwas_thye_ghana$SNP)

Overlap the SNPs with the genes.

overlaps <- findOverlaps(snp_coords, tss_gr, ignore.strand = TRUE)

How many SNPs were found for each gene?

snps_per_gene <- countSubjectHits(overlaps)
stopifnot(length(snps_per_gene) == length(gene_names))
summary(snps_per_gene)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   24.00   51.00   55.03   80.00  265.00
sum(snps_per_gene > 0)
## [1] 10265

How many genes were found for each SNP?

genes_per_snp <- countQueryHits(overlaps)
stopifnot(length(genes_per_snp) == length(snp_coords))
table(genes_per_snp)
## genes_per_snp
##       0       1       2       3       4       5       6       7       8 
## 1953105  311725   90118   25161    7381    2544     987     518     270 
##       9      10      11      12      13 
##     165      57      12      11       4

Convert overlap results to use original SNP and gene names.

results <- data.frame(as.matrix(overlaps))
colnames(results) <- c("rsID", "gene")
results$rsID <- mcols(snp_coords)$RefSNP_id[results$rsID]
results$gene <- mcols(tss_gr)$ensembl_gene_id[results$gene]
head(results)
##         rsID            gene
## 1  rs6603811 ENSG00000215790
## 2  rs6603811 ENSG00000008130
## 3 rs12044597 ENSG00000215790
## 4 rs12044597 ENSG00000008130
## 5  rs2272908 ENSG00000215790
## 6  rs2272908 ENSG00000008130

Add GWAS p-values.

rownames(gwas_thye_ghana) <- gwas_thye_ghana$SNP
results$gwas_p <- gwas_thye_ghana[results$rsID, "PVAL"]
stopifnot(!is.na(results$gwas_p))

For each gene, assign the minimum p-value of all its nearby SNPs.

results <- results %>%
  group_by(gene) %>%
  summarize(gwas_p = min(gwas_p),
            n_snps = n())

As expected, there is a negative correlation between the minimum GWAS p-value and the number of SNPs. However, there is no technical reason that genes assigned many SNPs should be more likely to be differentially expressed in our in vitro infection of dendritic cells with MTB.

cor(results$n_snps, results$gwas_p)
## [1] -0.4061506
plot(results$n_snps, results$gwas_p, xlab = "Number of SNPs nearby gene",
     ylab = "Minimum GWAS p-value",
     main = "Relationship between number of tested SNPs near gene\nand the minimum GWAS p-value of these SNPs")

Add coefficients from differential expression analysis.

limma_coef <- fit$coefficients
results <- merge(results, limma_coef, by.x = "gene", by.y = "row.names")

Convert the coefficients to their absolute value.

for (test in colnames(limma_coef)) {
  results[, test] <- abs(results[, test])
}

Create categorical variables to identify genes with absolute effect sizes greater or less than 1.

for (test in colnames(limma_coef)) {
  new_col <- paste0("de_", test)
  results[, new_col] <- ifelse(results[, test] > 1, "|logFC| > 1",
                               "|logFC| <= 1")
  results[, new_col] <- factor(results[, new_col],
                               levels = c("|logFC| <= 1", "|logFC| > 1"))
}
results %>% select(starts_with("de_")) %>% summary
##       de_diff_before       de_diff_after       de_treat_resist
##  |logFC| <= 1:10026   |logFC| <= 1:10167   |logFC| <= 1:7076  
##  |logFC| > 1 :  239   |logFC| > 1 :   98   |logFC| > 1 :3189  
##      de_treat_suscept      de_diff_treat  
##  |logFC| <= 1:6819    |logFC| <= 1:10111  
##  |logFC| > 1 :3446    |logFC| > 1 :  154

As a sanity check to confirm that our strategy of choosing the minimum SNP per gene is not biased, check that the mean expression level (i.e. the intercept term from the model) is not associated with lower p-values.

mean_expression_level <- fit$Amean[results$gene]
stopifnot(results$gene == names(mean_expression_level))
results$mean_expression_level <- mean_expression_level
test_intercept <- lm(gwas_p ~ mean_expression_level, data = results)
summary(test_intercept)
## 
## Call:
## lm(formula = gwas_p ~ mean_expression_level, data = results)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.09989 -0.06730 -0.04446  0.01148  0.89248 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           0.0645165  0.0029358  21.976  < 2e-16 ***
## mean_expression_level 0.0036112  0.0005812   6.214 5.38e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1265 on 10263 degrees of freedom
## Multiple R-squared:  0.003748,   Adjusted R-squared:  0.003651 
## F-statistic: 38.61 on 1 and 10263 DF,  p-value: 5.383e-10

This is confirmed. In fact, reassuringly it is the exact opposite pattern: genes with higher mean expression (and thus with more statistical power to call differential expression) are associated with higher GWAS p-values.

Results - Enrichment of GWAS p-values

Is there an enrichment of GWAS p-values for SNPs nearby genes that are differentially expressed following treatment with MTB in resistant individuals?

dist_treat_resist <- ggplot(results, aes(x = treat_resist, y = gwas_p)) +
  geom_point(alpha = 0.5) +
  geom_vline(xintercept = 1, color = "red") +
  labs(x = "|beta| for infection", y = "GWAS p-value",
       title = "GWAS p-value vs. DE effect size")
wilcox_treat_resist <- wilcox.test(gwas_p ~ de_treat_resist, data = results)
enrich_treat_resist <- ggplot(results, aes(x = de_treat_resist, y = gwas_p)) +
  geom_boxplot() +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_treat_resist$p.value),
       x = "DE effect size", y = "GWAS p-value")
plot_grid(dist_treat_resist, enrich_treat_resist, nrow = 1)

Is there an enrichment of GWAS p-values for SNPs nearby genes that are differentially expressed following treatment with MTB in susceptible individuals?

dist_treat_suscept <- dist_treat_resist %+% aes(x = treat_suscept) +
  labs(x = "|beta| for infection")
wilcox_treat_suscept <- wilcox.test(gwas_p ~ de_treat_suscept, data = results)
enrich_treat_suscept <- enrich_treat_resist %+% aes(x = de_treat_suscept) +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_treat_suscept$p.value),
       x = "DE effect size")
plot_grid(dist_treat_suscept, enrich_treat_suscept, nrow = 1)

Is there an enrichment of GWAS p-values for SNPs nearby genes that are differentially expressed between individuals that are resistant or susceptible to TB in the noninfected state?

dist_diff_before <- dist_treat_resist %+% aes(x = diff_before) +
  labs(x = "|beta| for susceptibility status")
wilcox_diff_before <- wilcox.test(gwas_p ~ de_diff_before, data = results)
enrich_diff_before <- enrich_treat_resist %+% aes(x = de_diff_before) +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_diff_before$p.value),
                       x = "DE effect size")
plot_grid(dist_diff_before, enrich_diff_before, nrow = 1)

Is there an enrichment of GWAS p-values for SNPs nearby genes that are differentially expressed between individuals that are resistant or susceptible to TB in the infected state?

dist_diff_after <- dist_treat_resist %+% aes(x = diff_after) +
  labs(x = "|beta| for susceptibility status")
wilcox_diff_after <- wilcox.test(gwas_p ~ de_diff_after, data = results)
enrich_diff_after <- enrich_treat_resist %+% aes(x = de_diff_after) +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_diff_after$p.value),
                       x = "DE effect size")
plot_grid(dist_diff_after, enrich_diff_after, nrow = 1)

Is there an enrichment of GWAS p-values for SNPs nearby genes in which the differential expression following treatment with MTB is different between individuals that are resistant or susceptible to TB?

dist_treat_suscept <- dist_treat_resist %+% aes(x = treat_suscept) +
  labs(x = "|beta| for interaction term")
wilcox_treat_suscept <- wilcox.test(gwas_p ~ de_treat_suscept, data = results)
enrich_treat_suscept <- enrich_treat_resist %+% aes(x = de_treat_suscept) +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_treat_suscept$p.value),
                       x = "DE effect size")
plot_grid(dist_treat_suscept, enrich_treat_suscept, nrow = 1)

Investigating relationship between effect size and number of SNPs

Is there an increase in GWAS SNPs nearby genes that are differentially expressed following treatment with MTB?

dist_snps_treat_resist <- ggplot(results, aes(x = treat_resist, y = n_snps)) +
  geom_point(alpha = 0.5) +
  geom_vline(xintercept = 1, color = "red") +
  labs(x = "|beta| for infection", y = "Number of SNPs nearby gene",
       title = "Number of SNPs vs. DE effect size")
wilcox_snps_treat_resist <- wilcox.test(n_snps ~ de_treat_resist, data = results)
enrich_treat_resist <- ggplot(results, aes(x = de_treat_resist, y = n_snps)) +
  geom_boxplot() +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_snps_treat_resist$p.value),
       x = "DE effect size", y = "Number of SNPs nearby gene")
plot_grid(dist_snps_treat_resist, enrich_treat_resist, nrow = 1)

Is there an increase in GWAS SNPs nearby genes that are differentially expressed between individuals that are resistant or susceptible to TB in the noninfected state?

dist_snps_diff_before <- dist_snps_treat_resist %+% aes(x = diff_before) +
  labs(x = "|beta| for susceptibility status")
wilcox_snps_diff_before <- wilcox.test(n_snps ~ de_diff_before, data = results)
enrich_diff_before <- enrich_treat_resist %+% aes(x = de_diff_before) +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_snps_diff_before$p.value),
       x = "DE effect size")
plot_grid(dist_snps_diff_before, enrich_diff_before, nrow = 1)

Is there an increase in GWAS SNPs nearby genes in which the differential expression following treatment with MTB is different between individuals that are resistant or susceptible to TB?

dist_snps_diff_treat <- dist_snps_treat_resist %+% aes(x = diff_treat) +
  labs(x = "|beta| for interaction term")
wilcox_snps_diff_treat <- wilcox.test(n_snps ~ de_diff_treat, data = results)
enrich_diff_treat <- enrich_treat_resist %+% aes(x = de_diff_treat) +
  labs(title = sprintf("Wilcox test p-value: %.2e", wilcox_snps_diff_treat$p.value),
                       x = "DE effect size")
plot_grid(dist_snps_diff_treat, enrich_diff_treat, nrow = 1)

log-log plots and trend lines

I obtained significant results when splitting the data in two: genes with effect size greater than and below 1. If I perform a linear regression, only the effect of treatment is statistically significant. It is easier to see why this is when viewing the log-log transformed plot, which is always the plot in the right panel below.

Effect of treatment in resistant individuals.

summary(lm(gwas_p ~ treat_resist, data = results))
## 
## Call:
## lm(formula = gwas_p ~ treat_resist, data = results)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.08521 -0.06834 -0.04486  0.01102  0.88811 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.085346   0.001698   50.26  < 2e-16 ***
## treat_resist -0.004648   0.001236   -3.76 0.000171 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1267 on 10263 degrees of freedom
## Multiple R-squared:  0.001375,   Adjusted R-squared:  0.001278 
## F-statistic: 14.13 on 1 and 10263 DF,  p-value: 0.0001711
plot_grid(dist_treat_resist + geom_smooth(method = "lm"),
          dist_treat_resist + geom_smooth(method = "lm") + scale_x_log10() + scale_y_log10())

Effect of treatment in susceptible individuals.

summary(lm(gwas_p ~ treat_suscept, data = results))
## 
## Call:
## lm(formula = gwas_p ~ treat_suscept, data = results)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.08501 -0.06832 -0.04482  0.01126  0.88825 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.085061   0.001707  49.830  < 2e-16 ***
## treat_suscept -0.004079   0.001175  -3.473 0.000517 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1267 on 10263 degrees of freedom
## Multiple R-squared:  0.001174,   Adjusted R-squared:  0.001076 
## F-statistic: 12.06 on 1 and 10263 DF,  p-value: 0.0005172
plot_grid(dist_treat_suscept + geom_smooth(method = "lm"),
          dist_treat_suscept + geom_smooth(method = "lm") + scale_x_log10() + scale_y_log10())

Effect of susceptibility status in noninfected state.

summary(lm(gwas_p ~ diff_before, data = results))
## 
## Call:
## lm(formula = gwas_p ~ diff_before, data = results)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.08437 -0.06838 -0.04485  0.01143  0.88748 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.084475   0.001680  50.294   <2e-16 ***
## diff_before -0.014807   0.004812  -3.077   0.0021 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1267 on 10263 degrees of freedom
## Multiple R-squared:  0.0009216,  Adjusted R-squared:  0.0008242 
## F-statistic: 9.467 on 1 and 10263 DF,  p-value: 0.002098
plot_grid(dist_diff_before + geom_smooth(method = "lm"),
          dist_diff_before + geom_smooth(method = "lm") + scale_x_log10() + scale_y_log10())

Effect of susceptibility status in infected state.

summary(lm(gwas_p ~ diff_after, data = results))
## 
## Call:
## lm(formula = gwas_p ~ diff_after, data = results)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.08475 -0.06858 -0.04479  0.01141  0.88648 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.084923   0.001683  50.474  < 2e-16 ***
## diff_after  -0.020792   0.006005  -3.463 0.000537 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1267 on 10263 degrees of freedom
## Multiple R-squared:  0.001167,   Adjusted R-squared:  0.00107 
## F-statistic: 11.99 on 1 and 10263 DF,  p-value: 0.0005372
plot_grid(dist_diff_after + geom_smooth(method = "lm"),
          dist_diff_after + geom_smooth(method = "lm") + scale_x_log10() + scale_y_log10())

Effect of interaction term.

summary(lm(gwas_p ~ treat_suscept, data = results))
## 
## Call:
## lm(formula = gwas_p ~ treat_suscept, data = results)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.08501 -0.06832 -0.04482  0.01126  0.88825 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.085061   0.001707  49.830  < 2e-16 ***
## treat_suscept -0.004079   0.001175  -3.473 0.000517 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1267 on 10263 degrees of freedom
## Multiple R-squared:  0.001174,   Adjusted R-squared:  0.001076 
## F-statistic: 12.06 on 1 and 10263 DF,  p-value: 0.0005172
plot_grid(dist_treat_suscept + geom_smooth(method = "lm"),
          dist_treat_suscept + geom_smooth(method = "lm") + scale_x_log10() + scale_y_log10())

Session information

sessionInfo()
## R version 3.2.1 Patched (2015-07-12 r68650)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Scientific Linux release 6.7 (Carbon)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] cowplot_0.6.2                           
##  [2] ggplot2_2.1.0                           
##  [3] dplyr_0.4.3                             
##  [4] SNPlocs.Hsapiens.dbSNP144.GRCh38_0.99.11
##  [5] BSgenome_1.36.3                         
##  [6] rtracklayer_1.30.4                      
##  [7] Biostrings_2.38.4                       
##  [8] XVector_0.10.0                          
##  [9] GenomicRanges_1.22.4                    
## [10] GenomeInfoDb_1.6.3                      
## [11] IRanges_2.4.8                           
## [12] S4Vectors_0.8.11                        
## [13] BiocGenerics_0.16.1                     
## [14] biomaRt_2.26.1                          
## [15] data.table_1.9.6                        
## [16] limma_3.26.9                            
## [17] knitr_1.13.1                            
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.5                plyr_1.8.3                
##  [3] formatR_1.4                futile.logger_1.4.1       
##  [5] bitops_1.0-6               futile.options_1.0.0      
##  [7] tools_3.2.1                zlibbioc_1.14.0           
##  [9] digest_0.6.9               gtable_0.2.0              
## [11] RSQLite_1.0.0              evaluate_0.9              
## [13] DBI_0.4-1                  yaml_2.1.13               
## [15] stringr_1.0.0              grid_3.2.1                
## [17] Biobase_2.28.0             R6_2.1.2                  
## [19] AnnotationDbi_1.32.3       XML_3.98-1.4              
## [21] BiocParallel_1.2.22        rmarkdown_1.0             
## [23] lambda.r_1.1.7             magrittr_1.5              
## [25] scales_0.4.0               Rsamtools_1.22.0          
## [27] htmltools_0.3.5            GenomicAlignments_1.6.3   
## [29] assertthat_0.1             SummarizedExperiment_1.0.2
## [31] colorspace_1.2-6           labeling_0.3              
## [33] stringi_1.1.1              lazyeval_0.1.10           
## [35] munsell_0.4.3              RCurl_1.95-4.8            
## [37] chron_2.3-47